Abstract

We investigate the randomized and quantum communication complexity of the Hamming
Distance problem, which is to determine if the Hamming distance between two $n$-bit strings is no
less than a threshold $d$. We prove a quantum lower bound of $\Omega(d)$ qubits in the general interactive
model with shared prior entanglement. We also construct a classical protocol of $O(d \log d)$ bits
in the restricted Simultaneous Message Passing model, improving previous protocols of $O(d^2)$ bits
(A. C.-C. Yao, Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing,
pp. 77-81, 2003) and $O(d \log n)$ bits (D. Gavinsky, J. Kempe, and R. de Wolf, quant-ph/0411051,
2004).
1 Introduction



Communication complexity was introduced by Yao and has been extensively studied since,
not only for its own intriguing problems but also for its many applications, ranging from circuit
lower bounds to data streaming algorithms. We refer the reader to the monograph of Kushilevitz
and Nisan for an excellent survey.

We recall some basic concepts below. Let $n$ be an integer and $X = Y = \{0,1\}^n$. Let
$f : X \times Y \to \{0,1\}$ be a Boolean function. Consider the scenario where two parties, Alice and Bob,
who know only $x \in X$ and $y \in Y$, respectively, communicate interactively with each other to
compute $f(x,y)$. The deterministic communication complexity of $f$, denoted by $D(f)$, is defined to
be the minimum integer $k$ such that there is a protocol for computing $f$ using no more than $k$ bits
of communication on any pair of inputs. The randomized communication complexity of $f$, denoted
by $R^{\mathrm{pub}}(f)$, is similarly defined, with the exception that Alice and Bob can use publicly announced
random bits and that they are required to compute $f(x,y)$ correctly with probability at least $2/3$.
One of the central themes in the study of classical communication complexity is to understand how
randomness helps in saving communication cost. A basic finding of Yao is that there are
functions $f$ such that $R(f) = O(\log D(f))$. One example is the Equality problem, which simply
checks whether $x = y$.

Later results show that different ways of using randomness result in quite subtle changes in
communication complexity. A basic finding in this regard, due to Newman, is that public-coin
protocols can save at most $O(\log n)$ bits over protocols in which Alice and Bob toss private (and
independent) coins. The situation is, however, dramatically different in the Simultaneous Message
Passing (SMP) model, also introduced by Yao, where Alice and Bob each send a message to a
third party, the referee, who then outputs the outcome of the protocol. This is clearly a more restricted
model, and for any function the communication complexity in this model is at least that in the
general interactive communication model. Denote by $R^{\|}(f)$ and $R^{\|,\mathrm{pub}}(f)$ the communication
complexities in the SMP model with private and public random coins, respectively. It is interesting
to note that $R^{\|,\mathrm{pub}}(\mathrm{Equality}) = O(1)$ but $R^{\|}(\mathrm{Equality}) = \Theta(\sqrt{n})$.

Yao also initiated the study of quantum communication complexity, where Alice and Bob
are equipped with quantum computational power and exchange quantum bits. Allowing an error
probability of no more than $1/3$ in the interactive model, the resulting communication complexity
is the quantum communication complexity of $f$, denoted by $Q(f)$. If the two parties are allowed
to share prior quantum entanglement, the quantum analogue of shared randomness, the communication
complexity is denoted by $Q^*(f)$. Similarly, the quantum communication complexities in the SMP
model are denoted by $Q^{\|}$ and $Q^{\|,*}$, depending on whether prior entanglement is shared. The
following relations among the measures are easy to observe.

$$Q^*(f) \le R^{\mathrm{pub}}(f) \le R^{\|,\mathrm{pub}}(f). \qquad (1)$$

Two very interesting problems in both communication models are the power of quantumness,
i.e., determining the biggest gap between quantum and randomized communication complexities,
and the power of shared entanglement, i.e., determining the biggest gap between quantum
communication complexities with and without shared entanglement. An important result for the first
problem, by Buhrman, Cleve, Watrous and de Wolf, is $Q^{\|}(\mathrm{Equality}) = O(\log n)$, an exponential
saving compared to the randomized counterpart $R^{\|}(\mathrm{Equality}) = \Omega(\sqrt{n})$ mentioned above.
This exponential separation was generalized by Yao, who showed that $R^{\|,\mathrm{pub}}(f)$ being constant implies
$Q^{\|}(f) = O(\log n)$. As an application, Yao considered the Hamming Distance problem defined
below. For any $x, y \in \{0,1\}^n$, the Hamming weight of $x$, denoted by $|x|$, is the number of 1's in $x$,
and the Hamming distance of $x$ and $y$ is $|x \oplus y|$, with "$\oplus$" being bit-wise XOR.

Definition 1.1. For $1 \le d \le n$, the $d$-Hamming Distance problem is to compute the following
Boolean function $\mathrm{HAM}_{n,d} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$, with $\mathrm{HAM}_{n,d}(x,y) = 1$ if and only if $|x \oplus y| > d$.
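As a concrete reading of Definition 1.1, the following minimal Python sketch evaluates the function directly on bit strings. The helper names (`hamming_weight`, `hamming_distance`, `ham`) are illustrative choices and do not appear in the paper; the sketch takes the threshold condition $|x \oplus y| > d$ exactly as stated in the definition.

```python
def hamming_weight(x: str) -> int:
    """Number of 1's in the bit string x, i.e. |x|."""
    return x.count("1")


def hamming_distance(x: str, y: str) -> int:
    """|x XOR y| for two bit strings of equal length."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a != b for a, b in zip(x, y))


def ham(x: str, y: str, d: int) -> int:
    """HAM_{n,d}(x, y): 1 if and only if the Hamming distance exceeds d."""
    return 1 if hamming_distance(x, y) > d else 0
```

This direct evaluation needs both inputs in one place; the communication question is how few bits suffice when $x$ and $y$ are held by different parties.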

Lemma 1.2 (Yao). $R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d}) = O(d^2)$.

In a recent paper, Gavinsky, Kempe and de Wolf gave another classical protocol, which is an
improvement over Yao's when $d \gg \log n$.

Lemma 1.3 (GKW). $R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d}) = O(d \log n)$.

In this paper, we observe a lower bound for $Q^*(\mathrm{HAM}_{n,d})$, which is also a lower bound for
$R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d})$ according to Inequality (1).

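Notice that $\mathrm{HAM}(x,y) = n - \mathrm{HAM}(x,\bar{y})$, where $\bar{y} = 11\cdots1 \oplus y$ (here and below, $\mathrm{HAM}(x,y)$
without subscripts denotes the Hamming distance $|x \oplus y|$). Therefore $Q^*(\mathrm{HAM}_{n,d}) =
Q^*(\mathrm{HAM}_{n,n-d})$, and we need only consider the case $d \le n/2$.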

Proposition 1.4. For any $d \le n/2$, $Q^*(\mathrm{HAM}_{n,d}) = \Omega(d)$.

We then construct a public-coin randomized SMP protocol that almost matches the lower bound 
and improves both of the above protocols. 

Theorem 1.5. $R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d}) = O(d \log d)$.

We shall prove the above two results in the following sections. Finally we discuss open problems 
and a plausible approach for closing the gap. 

Other related work: Ambainis, Gasarch, Srinivasan, and Utis considered the error-free
communication complexity, and proved that any error-free quantum protocol for the Hamming
Distance problem requires at least $n - 2$ qubits of communication in the interactive model, for any
$d < n - 1$.
2 Lower bound of the quantum communication complexity of the
Hamming Distance problem

To prove the lower bound, we restrict $\mathrm{HAM}_{n,d}$ to those pairs of inputs with equal Hamming
weight. More specifically, for an integer $k$, $1 \le k \le n$, define $X_k = Y_k = \{x : x \in \{0,1\}^n, |x| = k\}$.
Let $\mathrm{HAM}_{n,k,d} : X_k \times Y_k \to \{0,1\}$ be the restriction of $\mathrm{HAM}_{n,d}$ to $X_k \times Y_k$.

Before proving Proposition 1.4, we briefly introduce some related results. Let $x, y \in \{0,1\}^n$. The
Disjointness problem is to compute the following Boolean function $\mathrm{DISJ}_n : \{0,1\}^n \times \{0,1\}^n \to
\{0,1\}$, where $\mathrm{DISJ}_n(x,y) = 1$ if and only if there exists an integer $i$, $1 \le i \le n$, so that $x_i = y_i = 1$. It is
known that $R(\mathrm{DISJ}_n) = \Theta(n)$ and $Q^*(\mathrm{DISJ}_n) = \Theta(\sqrt{n})$.

We shall use an important lemma of Razborov, which is more general than his remarkable
lower bound on the quantum communication complexity of Disjointness. Here we abuse
notation by viewing $x \in \{0,1\}^n$ as the set $\{i \in [n] : x_i = 1\}$.

Lemma 2.1 (Razborov). Suppose $k \le n/4$ and $l \le k/4$. Let $D : [k] \to \{0,1\}$ be any Boolean
predicate such that $D(l) \ne D(l-1)$. Let $f_{n,k,D} : X_k \times Y_k \to \{0,1\}$ be such that $f_{n,k,D}(x,y) =
D(|x \cap y|)$. Then $Q^*(f_{n,k,D}) = \Omega(\sqrt{kl})$.

Proof of Proposition 1.4. Consider $D$ in Lemma 2.1 such that $D(t) = 1$ if and only if $t < l$.
For any $x, y \in X_k$, we have $|x \cap y| = k - \mathrm{HAM}(x,y)/2$. Let $l = k - d/2$; then $k - \mathrm{HAM}(x,y)/2 < l$
if and only if $\mathrm{HAM}(x,y) > d$. Therefore, $D(|x \cap y|) = 1$ if and only if $\mathrm{HAM}(x,y) > d$. This implies
that $f_{n,k,D}$ and $\mathrm{HAM}_{n,k,d}$ are actually the same function, and thus $Q^*(f_{n,k,D}) = Q^*(\mathrm{HAM}_{n,k,d})$.
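The identity $|x \cap y| = k - \mathrm{HAM}(x,y)/2$ for strings of equal weight $k$ carries this step of the proof. The following small Python sketch checks it exhaustively for small parameters; the function name `check_intersection_identity` is a hypothetical helper introduced only for this illustration, and sets of positions stand in for bit strings, as in the notation above.

```python
from itertools import combinations


def check_intersection_identity(n: int, k: int) -> bool:
    """Check |x ∩ y| = k - |x XOR y| / 2 for all pairs of weight-k subsets of [n]."""
    subsets = [frozenset(c) for c in combinations(range(n), k)]
    for x in subsets:
        for y in subsets:
            distance = len(x ^ y)  # symmetric difference = Hamming distance
            # Multiply through by 2 to stay in integer arithmetic.
            if 2 * len(x & y) != 2 * k - distance:
                return False
    return True
```

The check reflects the counting argument: each of $x$ and $y$ contributes $k - |x \cap y|$ positions to the symmetric difference, so $|x \oplus y| = 2(k - |x \cap y|)$.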

To use Lemma 2.1, the following two constraints on $k$ and $l$ need to be satisfied: $k \le n/4$ and $l \le
k/4$. When $d \le 3n/8$, let $k = 2d/3 \le n/4$; then $l = 2d/3 - d/2 = d/6 = k/4 \le n/16$. Both requirements for
$k$ and $l$ are satisfied. So applying Lemma 2.1 we get $Q^*(\mathrm{HAM}_{n,k,d}) = Q^*(f_{n,k,D}) = \Omega(\sqrt{kl}) = \Omega(d)$.
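For concreteness, the arithmetic behind the last step can be written out as follows. This is a worked restatement using only the values of $k$ and $l$ chosen above, ignoring integrality of $k$ and $l$.

```latex
\[
  k = \frac{2d}{3}, \qquad
  l = k - \frac{d}{2} = \frac{d}{6} = \frac{k}{4}, \qquad
  \sqrt{kl} = \sqrt{\frac{2d}{3}\cdot\frac{d}{6}}
            = \sqrt{\frac{d^{2}}{9}}
            = \frac{d}{3}
            = \Omega(d).
\]
```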

For $3n/8 < d \le n/2$, we reduce to the above case ($d \le 3n/8$) rather than applying Lemma 2.1 directly. Let
$m = \lceil 8d/5 - 3n/5 \rceil$. Fix the first $m$ bits of $x$ to be all 1's, and use $x'$ to denote $x_{m+1} \cdots x_n$. Similarly,
fix the first $m$ bits of $y$ to be all 0's, and use $y'$ to denote $y_{m+1} \cdots y_n$. Put $n' = n - m$, $k' = n'/4$, and
$d' = d - m$. Then $\mathrm{HAM}(x,y) = \mathrm{HAM}(x',y') + m$ and $Q^*(\mathrm{HAM}_{n,d}) \ge Q^*(\mathrm{HAM}_{n',k',d'})$.
It is easy to verify that $d' \le 3n'/8$ and $d' = \Omega(d)$. Employing the result of the case $d \le 3n/8$,
we have $Q^*(\mathrm{HAM}_{n',k',d'}) = \Omega(d')$. Thus $Q^*(\mathrm{HAM}_{n,d}) \ge Q^*(\mathrm{HAM}_{n',k',d'}) = \Omega(d') = \Omega(d)$. ■
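The two facts described as easy to verify can be checked numerically. The sketch below computes $m$, $n'$ and $d'$ for given $n$ and $d$ in the range $3n/8 < d \le n/2$; the function name `reduce_parameters` is a hypothetical helper, and integer arithmetic is used for the ceiling to avoid floating-point rounding.

```python
from typing import Tuple


def reduce_parameters(n: int, d: int) -> Tuple[int, int, int]:
    """Return (m, n', d') for the reduction used when 3n/8 < d <= n/2."""
    if not (8 * d > 3 * n and 2 * d <= n):
        raise ValueError("expected 3n/8 < d <= n/2")
    # m = ceil((8d - 3n) / 5), computed with integer floor division.
    m = -((3 * n - 8 * d) // 5)
    return m, n - m, d - m
```

For example, `reduce_parameters(100, 45)` gives $m = 12$, $n' = 88$, $d' = 33$, and indeed $3n'/8 = 33$. In general $d' \le 3n'/8$ is equivalent to $5m \ge 8d - 3n$, which the ceiling guarantees, and $d' > 3(n-d)/5 - 1 \ge 3d/5 - 1$ because $n \ge 2d$.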

3 Upper bound of the classical communication complexity of the 
Hamming Distance problem 

To prove Theorem 1.5, we reduce the $\mathrm{HAM}_{n,d}$ problem to the $\mathrm{HAM}_{16d^2,d}$ problem by the following
lemma.

Lemma 3.1. 

$$R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d}) = O\bigl(R^{\|,\mathrm{pub}}(\mathrm{HAM}_{16d^2,d})\bigr) + O(d \log d)$$
Note that Theorem 1.5 immediately follows from Lemma 3.1: by Lemma 1.3, $R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d}) =
O(d \log n)$, thus $R^{\|,\mathrm{pub}}(\mathrm{HAM}_{16d^2,d}) = O(d \log d^2) = O(d \log d)$. Now by Lemma 3.1 we have
$R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d}) = O(d \log d)$. So in what follows, we shall prove Lemma 3.1. Define a partial
function $\mathrm{HAM}_{n,d|2d}(x,y)$ with domain $\{(x,y) : x, y \in \{0,1\}^n,\ |x \oplus y| \text{ is either at most } d \text{ or at least } 2d\}$
as follows.

$$\mathrm{HAM}_{n,d|2d}(x,y) = \begin{cases} 0 & \text{if } \mathrm{HAM}(x,y) \le d, \\ 1 & \text{if } \mathrm{HAM}(x,y) \ge 2d. \end{cases} \qquad (2)$$

Then 

Lemma 3.2. 

$$R^{\|,\mathrm{pub}}(\mathrm{HAM}_{n,d|2d}) = O(1)$$

Proof of Lemma 3.2. We revise Yao's protocol to design an $O(1)$ protocol for $\mathrm{HAM}_{n,d|2d}$.
Assume the Hamming distance between $x$ and $y$ is $k$. Alice and Bob share a random public
string, which consists of a sequence of $\gamma n$ ($\gamma$ is some constant to be determined later) random bits,
each of which is generated independently with probability $p = 1/(2d)$ of being 1. Denote this string
by $z_1, z_2, \ldots, z_\gamma$, each $z_i$ of length $n$. Alice sends the string $a = a_1 a_2 \cdots a_\gamma$ to the referee, where
$a_i = x \cdot z_i \pmod 2$. Bob sends the string $b = b_1 b_2 \cdots b_\gamma$ to the referee, where $b_i = y \cdot z_i \pmod 2$.
The referee announces $\mathrm{HAM}_{n,d}(x,y) = 1$ if and only if the Hamming distance between $a$ and $b$ is
more than $m = (1/2 - q)\gamma$, where $q = ((1 - 1/d)^d + (1 - 1/d)^{2d})/4$.
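The protocol just described can be sketched in a few lines of Python. This is an illustrative simulation under stated assumptions, not the authors' implementation: a seeded pseudo-random generator stands in for the shared public coins, bit strings are lists of 0/1 integers, and the names `public_strings`, `message` and `referee` are hypothetical.

```python
import random


def public_strings(n, d, gamma, seed):
    """Shared randomness: gamma strings of n bits, each bit 1 with probability 1/(2d)."""
    rng = random.Random(seed)  # both parties are assumed to know the same seed
    p = 1.0 / (2 * d)
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(gamma)]


def message(x, zs):
    """A party's message: the inner products x . z_i (mod 2), one bit per z_i."""
    return [sum(xi & zi for xi, zi in zip(x, z)) % 2 for z in zs]


def referee(a, b, d):
    """Announce 1 iff the Hamming distance of a and b exceeds (1/2 - q) * gamma."""
    gamma = len(a)
    q = ((1 - 1 / d) ** d + (1 - 1 / d) ** (2 * d)) / 4
    cutoff = (0.5 - q) * gamma
    distance = sum(ai ^ bi for ai, bi in zip(a, b))
    return 1 if distance > cutoff else 0
```

Each party sends $\gamma$ bits regardless of $n$, which is where the $O(1)$ cost comes from once $\gamma$ is fixed to a constant.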

Now we prove that the above protocol is correct with probability at least $49/50$. Let $c_i = a_i \oplus b_i$.
Notice that the Hamming distance between $a$ and $b$ is the number of 1's in $c = c_1 c_2 \cdots c_\gamma$. We need
the following lemma by Yao.

Lemma 3.3. Assume that the Hamming distance between $x$ and $y$ is $k$. Given $c$ as defined above,
each $c_i$ is an independent random variable with probability $\alpha_k$ of being 1, where $\alpha_k = 1/2 - \tfrac{1}{2}(1 -
1/d)^k$.
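One standard way to see where this expression comes from (a sketch, not necessarily the argument in Yao's paper) is to note that $c_i$ is the parity of the $k$ bits of $z_i$ at the positions where $x$ and $y$ differ, and to compute the expectation of $(-1)^{c_i}$:

```latex
\begin{align*}
  c_i &= a_i \oplus b_i = (x \oplus y) \cdot z_i \bmod 2, \\
  \mathbb{E}\bigl[(-1)^{c_i}\bigr]
      &= (1 - 2p)^{k} = \Bigl(1 - \frac{1}{d}\Bigr)^{k}
      \qquad \text{(independent bits, } p = 1/(2d)\text{)}, \\
  \alpha_k = \Pr[c_i = 1]
      &= \frac{1}{2}\Bigl(1 - \mathbb{E}\bigl[(-1)^{c_i}\bigr]\Bigr)
       = \frac{1}{2} - \frac{1}{2}\Bigl(1 - \frac{1}{d}\Bigr)^{k}.
\end{align*}
```

Independence across $i$ follows because the strings $z_i$ are drawn independently.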

Since $\alpha_k$ is an increasing function of $k$, to separate $k \le d$ from $k \ge 2d$ it is sufficient
to discriminate the two cases $k = d$ and $k = 2d$. Let $N_k$ be a random variable denoting the
number of 1's in $c$, and let $E(N_k)$ and $\sigma(N_k)$ denote the corresponding expectation and standard deviation,
respectively. Then we have $E(N_k) = \alpha_k \gamma$ and $\sigma(N_k) \le (\alpha_k \gamma)^{1/2}$. Thus $E(N_{2d}) - E(N_d) =
\gamma(\alpha_{2d} - \alpha_d) = \tfrac{1}{2}\gamma (1 - \tfrac{1}{d})^d (1 - (1 - \tfrac{1}{d})^d) \ge \gamma/8$. Let $\gamma = 20000$; then $E(N_{2d}) - E(N_d) \ge 2500$, while
$\sigma(N_d), \sigma(N_{2d}) \le (\tfrac{1}{2}\gamma)^{1/2} = 100$. The cutoff point in the protocol is the midpoint of $E(N_d)$ and $E(N_{2d})$.
By Chebyshev's inequality, with probability at most $1/100$, $|N_d - E(N_d)| \ge 10\sigma(N_d) = 1000$. The same
holds for $N_{2d}$. Thus with probability at least $49/50$, the number of 1's in $c$ being more than the cutoff
point implies $k \ge 2d$ and vice versa. Therefore, $O(\gamma)$ communication is sufficient to discriminate
the case $\mathrm{HAM}(x,y) \ge 2d$ from $\mathrm{HAM}(x,y) \le d$ with error probability at most $1/50$. ■
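The quantities in this argument are easy to evaluate numerically. The following sketch computes $\alpha_k$, the expected counts, the referee's cutoff, and the Chebyshev tail bound; the function names are hypothetical helpers and floating-point arithmetic is assumed to be accurate enough for illustration.

```python
def alpha(k: int, d: int) -> float:
    """Probability that a single c_i equals 1 when the Hamming distance is k."""
    return 0.5 - 0.5 * (1 - 1 / d) ** k


def expected_ones(k: int, d: int, gamma: int) -> float:
    """E(N_k) = alpha_k * gamma."""
    return alpha(k, d) * gamma


def cutoff(d: int, gamma: int) -> float:
    """The referee's threshold (1/2 - q) * gamma from the protocol."""
    q = ((1 - 1 / d) ** d + (1 - 1 / d) ** (2 * d)) / 4
    return (0.5 - q) * gamma


def chebyshev_tail_bound(t: float) -> float:
    """Chebyshev upper bound on Pr[|N - E(N)| >= t * sigma]."""
    return 1 / t ** 2
```

Evaluating `expected_ones(2 * d, d, gamma) - expected_ones(d, d, gamma)` for concrete values of `d` shows how the gap between the two cases compares with the $\gamma/8$ figure used in the text, and `cutoff(d, gamma)` can be compared with the midpoint of the two expectations.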

The following fact is also useful.

Fact 1. If $2d$ balls are randomly thrown into $16d^2$ buckets, then with probability at least $7/8$,
each bucket has at most one ball.
Proof of Fact 1. There are $\binom{2d}{2}$ pairs of balls. The probability of one specific pair of balls falling
into the same bucket is $\frac{1}{16d^2} \cdot \frac{1}{16d^2} \cdot 16d^2 = \frac{1}{16d^2}$. Thus the probability of having a pair of balls in
the same bucket is upper bounded by $\frac{1}{16d^2} \cdot \binom{2d}{2} < 1/8$. Thus Fact 1 holds. ■
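The union bound above can be compared with the exact probability that no two balls collide. The sketch below computes that probability with exact rational arithmetic, assuming each ball lands in a uniformly random bucket independently of the others; `no_collision_probability` is a hypothetical helper name.

```python
from fractions import Fraction


def no_collision_probability(balls: int, buckets: int) -> Fraction:
    """Exact probability that all balls land in distinct buckets."""
    prob = Fraction(1)
    for i in range(balls):
        # The (i+1)-th ball must avoid the i buckets already occupied.
        prob *= Fraction(buckets - i, buckets)
    return prob
```

With `balls = 2 * d` and `buckets = 16 * d * d`, the returned value can be checked against the $7/8$ bound of Fact 1 for any particular $d$.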
Now we are ready to prove Lemma 3.1.

Proof of Lemma 3.1. If $16d^2 \ge n$, the $O(d \log n)$ communication protocol in Lemma 1.3 is
also an $O(d \log d)$ protocol.

If $16d^2 < n$, suppose we already have a protocol $P_1$ of $C$ communication to distinguish the cases
$|x \oplus y| \le d$ and $d < |x \oplus y| \le 2d$ with error probability at most $1/8$. Then we can have a protocol
of $C + O(1)$ communication for $\mathrm{HAM}_{n,d}$ with error probability at most $1/4$. Indeed, by repeating
the protocol for $\mathrm{HAM}_{n,d|2d}(x,y)$ several times, we can have a protocol $P_2$ of $O(1)$ communication
to distinguish the cases $|x \oplus y| \le d$ and $|x \oplus y| > 2d$ with error probability at most $1/8$. Now
the whole protocol $P$ is as follows. Alice sends the concatenation of $m_{A,1}$ and $m_{A,2}$, which are her
messages when she runs $P_1$ and $P_2$, respectively. Bob likewise sends the concatenation of his two
corresponding messages $m_{B,1}$ and $m_{B,2}$. The referee then runs protocol $P_i$ on $(m_{A,i}, m_{B,i})$ and gets
the result $r_i$, for $i = 1, 2$. The referee announces $|x \oplus y| \le d$ if and only if both $r_1$ and $r_2$ say $|x \oplus y| \le d$.

It is easy to see that the protocol is correct. If $|x \oplus y| \le d$, then each of the two protocols announces so
with probability at least $7/8$, and thus $P$ says so with probability at least $3/4$. If $|x \oplus y| > d$,
then one of the protocols gets the correct range of $|x \oplus y|$ with probability at least $7/8$, and thus $P$
announces $|x \oplus y| > d$ with probability at least $7/8$ too.
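The two probabilities quoted in this paragraph follow from a union bound over the error events of $P_1$ and $P_2$. Spelled out, under the error guarantees assumed for the two sub-protocols:

```latex
\begin{align*}
  |x \oplus y| \le d: &\quad
    \Pr[P \text{ errs}]
      \le \Pr[P_1 \text{ errs}] + \Pr[P_2 \text{ errs}]
      \le \tfrac{1}{8} + \tfrac{1}{8} = \tfrac{1}{4}, \\
  |x \oplus y| > d: &\quad
    \Pr[P \text{ errs}]
      \le \Pr[\text{the applicable } P_i \text{ errs}]
      \le \tfrac{1}{8}.
\end{align*}
```

In the second case only the sub-protocol whose promise covers the actual distance needs to answer correctly, since a single "greater than $d$" verdict already determines the output of $P$.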

Now it remains to design a protocol of $O(R^{\|,\mathrm{pub}}(\mathrm{HAM}_{16d^2,d}))$ communication to distinguish
$|x \oplus y| \le d$ and $d < |x \oplus y| \le 2d$. First we assume that $n$ is divisible by $16d^2$; otherwise we pad
some 0's to the end of $x$ and $y$. Using the public random bits, Alice divides $x$ randomly into $16d^2$
parts evenly, and Bob divides $y$ correspondingly. Let $A_i, B_i$ ($1 \le i \le 16d^2$) denote the corresponding
parts of $x, y$. By Fact 1, with probability at least $7/8$, each pair $A_i, B_i$ contains at most one
bit on which $x$ and $y$ differ. Therefore, the Hamming distance of $A_i$ and $B_i$ is either 0 or
1, i.e., the Hamming distance of $A_i$ and $B_i$ equals the parity of $A_i \oplus B_i$, which is further equal to
$\mathrm{PARITY}(A_i) \oplus \mathrm{PARITY}(B_i)$. Let $a_i$ denote the parity of $A_i$, let $b_i$ denote the parity of $B_i$, and
let $a = a_1 a_2 \cdots a_{16d^2}$, $b = b_1 b_2 \cdots b_{16d^2}$. Then $\mathrm{HAM}_{16d^2,d}(a,b) = \mathrm{HAM}_{n,d}(x,y)$ with probability at
least $7/8$. So we run the best protocol for $\mathrm{HAM}_{16d^2,d}$ on the input $(a,b)$, and use the answer to
distinguish $|x \oplus y| \le d$ and $d < |x \oplus y| \le 2d$. ■
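The bucketing step can be sketched as follows. This is a minimal illustration under stated assumptions: a seeded shuffle stands in for the public random bits, inputs are lists of 0/1 integers, and `bucket_parities` is a hypothetical name. Both parties apply the same random partition, so each can compute its own parity string without communication.

```python
import random


def bucket_parities(x, y, d, seed):
    """Compress x and y to 16*d*d parity bits each, using a shared random partition."""
    buckets = 16 * d * d
    pad = (-len(x)) % buckets          # pad with 0's so the length divides evenly
    x = list(x) + [0] * pad
    y = list(y) + [0] * pad
    positions = list(range(len(x)))
    random.Random(seed).shuffle(positions)   # shared public randomness (assumed seed)
    size = len(x) // buckets
    a, b = [], []
    for i in range(buckets):
        part = positions[i * size:(i + 1) * size]
        a.append(sum(x[j] for j in part) % 2)  # Alice's parity bit for part i
        b.append(sum(y[j] for j in part) % 2)  # Bob's parity bit for part i
    return a, b
```

Whenever no part contains two differing positions, the Hamming distance of the parity strings equals that of the original inputs; otherwise it can only be smaller, and it always has the same parity.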

4 Discussion 

We conjecture that our quantum lower bound in Proposition 1.4 is tight. It seems plausible to remove
the $O(\log d)$ factor in our upper bound. Recently, Aaronson and Ambainis sharpened the
upper bound of the Set Disjointness problem from $O(\sqrt{n} \log n)$ to $O(\sqrt{n})$ using quantum local
search instead of Grover's search. In their method, it takes only a constant number of communicated qubits
to synchronize the two parties and simulate each quantum query. From Yao's protocol, one can
easily derive an $O(d \log d)$ two-way interactive quantum communication protocol using quantum
counting and the connection between quantum query and communication. Methods similar
to those of Aaronson and Ambainis might help to remove the $O(\log d)$ factor in this upper bound.